Solution: Use NULL as a Unique Value

Let’s see how using NULL as a unique value is the best solution.

Most problems with NULL values are based on a common misunderstanding of the behavior of SQL’s three-valued logic. This can be a challenge for programmers accustomed to the conventional true/false logic implemented in most other languages. But we can easily handle NULL values in SQL queries after a little study of how they work.

NULL in scalar expressions#

Let’s suppose Stan is thirty years old while Oliver’s age is unknown. If you were asked whether Stan is older than Oliver, your only possible answer would be “I don’t know.” If you were asked whether Stan is the same age as Oliver, your answer would still be “I don’t know.” And if you were asked to tell the sum of Stan and Oliver’s ages, your answer would be just the same: “I don’t know.”

Are NULLs Rational?

Now, let’s suppose that Charlie’s age is also unknown. If you were asked whether Oliver’s age is equal to Charlie’s age, your answer would still be “I don’t know.” This shows why the result of a comparison like NULL = NULL is also NULL.

The following table describes some cases where programmers expect one result but get something different.

Expression Expected Actual Reason
NULL = 0 TRUE NULL Null is not zero.
NULL = 12345 FALSE NULL Unknown if the unspecified value is equal to a given value.
NULL <> 12345 TRUE NULL Also unknown if it’s unequal.
NULL + 12345 12345 NULL Null is not zero.
NULL || 'string' ‘string’ NULL Null is not an empty string.
NULL = NULL TRUE NULL Unknown if one unspecified value is the same as another.
NULL <> NULL FALSE NULL Also unknown if they’re different.

Of course, these examples apply not only when using the NULL keyword but also to any column or expression whose value is null.

NULL in boolean expressions#

NULL is neither true nor false. A null value certainly isn’t true, but it isn’t the same as false either. If it were, then applying NOT to a NULL value would result in true. But that’s not the way it works; NOT (NULL) results in another NULL. This confuses some people who try to use boolean expressions with NULL.

The table “NULL in Boolean Expressions” describes some cases where programmers expect one result but get something different.

Expression Expected Actual Reason
NULL AND TRUE FALSE NULL Null is not false.
NULL AND FALSE FALSE FALSE Any truth value AND FALSE is false.
NULL OR FALSE FALSE NULL Null is not false.
NULL OR TRUE TRUE TRUE Any truth value OR TRUE is true.
NOT (NULL) TRUE NULL Null is not false.

The Right Result for the Wrong Reason

Searching for NULL#

Since neither equality nor inequality returns true when comparing one value to a NULL value, we need some other operation to search for a NULL value. Older SQL standards define the IS NULL predicate, which returns true if its single operand is NULL. The opposite, IS NOT NULL, returns false if its operand is NULL.

Let’s look for the Bugs table values where the value for assigned_to is NULL. Let’s see what happens if we press “RUN” in the following playground.

Searching for NULL

Now, let’s check the same thing for NOT NULL in the following playground.

Searching for not NULL

The SQL-99 standard defines another comparison predicate, IS DISTINCT FROM. This works like an ordinary inequality operator <>, except that it always returns true or false, even when its operands are null. Unfortunately, this predicate is not available in any MySQL versions now. It’s true that it used to relieve us from writing tedious expressions to test IS NULL before comparing to a value. The following two queries are equivalent; let’s check them one by one. The first query is using IS NULL to check for entries having NULL values.

Searching for NULL using IS NULL

The following query uses <> to search for NULL in the column assigned_to.

We can use this predicate with query parameters to which we want to send either a literal value like 1 or NULL:

Try replacing the question mark with some literal value first, like the literal 1, to see the processing of the query in the playground below.

Searching for NULL using <>

Support for IS DISTINCT FROM is inconsistent among database brands. PostgreSQL, IBM DB2, and Firebird do support it, whereas Oracle and Microsoft SQL Server don’t support it yet. MySQL offers a proprietary operator <=> that works like IS NOT DISTINCT FROM.

Declare columns NOT NULL#

It’s recommended to declare a NOT NULL constraint on a column for which a NULL would break a policy in our application or otherwise be nonsensical. It’s better to allow the database to enforce constraints uniformly rather than rely on application code.

For example, it’s reasonable to assume that any entry for the date_reported, reported_by, and status columns in the Bugs table would have a non-NULL value. Likewise, rows in child tables like Comments must also include non-NULL bug_id values, as they reference existing bugs. We should declare these columns with the NOT NULL option.

Some people recommend that we define a DEFAULT for every column so that if we omit the column in an INSERT statement, the column gets the DEFAULT value instead of null. That’s good advice for some columns but not for other columns. For example, Bugs.reported_by should not be NULL But what default, if any, should we declare for this column? It’s valid and common for a column to need a NOT NULL constraint and yet have no logical default value.

Dynamic defaults#

In some queries, we may need to force a column or expression to be non-NULL for the sake of simplifying the query logic, but we don’t want its value to be stored. What we need is a way to set a default value for a given column or expression but only in an ad hoc way for specific queries. For this, we can use the COALESCE() function. This function accepts a variable number of arguments and returns its first non-NULL argument.

Returning to the story about concatenating users’ names in the beginning of this chapter, you could use COALESCE() to make an expression that uses a single space in place of the middle initial, so a NULL-valued middle initial doesn’t make the whole expression become NULL.

Using COALESCE function to store values

COALESCE() is a standard SQL function. Some database brands support a similar function by another name, such as NVL() or ISNULL().

Antipattern: Use Null as an Ordinary Value, or Vice Versa
Synopsis: Ambiguous Groups
Mark as Completed
Report an Issue